Comparison of HMM and DTW methods in automatic recognition of pathological phoneme pronunciation

Authors

  • Robert Wielgat
  • Tomasz P. Zielinski
  • Pawel Swietojanski
  • Piotr Zoladz
  • Daniel Król
  • Tomasz Wozniak
  • Stanislaw Grabias
Abstract

In this paper, the recently proposed Human Factor Cepstral Coefficients (HFCC) are used for automatic recognition of pathological phoneme pronunciation in the speech of impaired children, and the efficiency of this approach is compared with that of the standard Mel-Frequency Cepstral Coefficients (MFCC) as a feature vector. Both dynamic time warping (DTW), working on whole words or embedded phoneme patterns, and hidden Markov models (HMM) are used as classifiers. The results demonstrate the superiority of combining HFCC features with a modified phoneme-based DTW classifier.
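As a rough illustration of the DTW side of this comparison, the sketch below implements textbook dynamic time warping with nearest-template classification. It is not the modified phoneme-based DTW described in the abstract, whose details are not given here. The names `dtw_distance` and `classify`, the Euclidean frame distance, and the toy one-dimensional "features" are all assumptions made for the example; in practice each frame would be an MFCC or HFCC vector.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of feature frames.

    Uses the standard step pattern (insertion, deletion, match) with no
    path constraints or length normalization.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # skip a frame of seq_b
                                 cost[i][j - 1],      # skip a frame of seq_a
                                 cost[i - 1][j - 1])  # match both frames
    return cost[n][m]


def classify(query, templates):
    """Return the label whose reference template is closest to the query.

    `templates` maps a hypothetical label to a list of reference sequences.
    """
    best_label, best_cost = None, float("inf")
    for label, references in templates.items():
        for reference in references:
            c = dtw_distance(query, reference)
            if c < best_cost:
                best_label, best_cost = label, c
    return best_label


# Toy one-dimensional "features"; real frames would be cepstral vectors.
templates = {
    "correct": [[[0.0], [1.0], [2.0]]],
    "substituted": [[[0.0], [3.0], [2.0]]],
}
query = [[0.0], [1.1], [1.0], [2.0]]
label = classify(query, templates)
```

Because DTW absorbs differences in speaking rate, the four-frame query can still be compared against three-frame templates without resampling.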

Similar resources

Automatic recognition of pathological phoneme production.

OBJECTIVE: Proper diagnosis and therapy of pathological pronunciation of phonemes play an important role in modern logopedics. To enhance the efficiency of diagnosis and therapy, automatic recognition of pathological phoneme pronunciation is addressed in this paper. The authors focus on the therapy of phoneme substitution disorders. PATIENTS AND METHODS: Recognized speech samples come from sp...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNN) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...

Pronunciation assessment via a comparison-based system

In this paper, we present preliminary results on applying a comparison-based framework to the task of pronunciation scoring. The comparison-based system works by aligning a student’s utterance with a teacher’s utterance via dynamic time warping (DTW). Features that describe the degree of mis-alignment are extracted from the aligned path and the distance matrix. We focus on a dataset in Levantin...

Automatic Scoring of Shadowing Speech Based on DNN Posteriors and Their DTW

Shadowing has become a well-known method to improve learners’ overall proficiency. Our previous studies realized automatic scoring of shadowing speech using HMM phoneme posteriors, called GOP (Goodness of Pronunciation) and learners’ TOEIC scores were predicted adequately. In this study, we enhance our studies from multiple angles: 1) a much larger amount of shadowing speech is collected, 2) ma...

Reaction Time in Phoneme Recognition: A Comparative Study among Iranian Upper-Intermediate vs. Advanced EFL Learners at Institute Level

The present study aimed to investigate reaction time in phoneme recognition through a comparison of Iranian upper-intermediate and advanced EFL learners at institute level. The main question this study tried to answer was whether there is a difference in reaction time in phoneme recognition among Iranian learners at institute level. To answer the question, 5Upper-Intermedi...

Publication date: 2007